perm filename NCC.TEX[AM,DBL] blob sn#581011 filedate 1981-04-29 generic text, type T, neo UTF8
~j(635)\f5
According to the outline Randy provided me with last week, we have come to that section of the talk where you are to be inundated with data, with case studies.  I think the plan is that after this you are supposed to welcome the vague generalizations with which Randy will close the talk. ~j\f5
~j\f5
To describe an expert system to you, I have to tell you a little about the task it's performing, the way in which it represents knowledge, and the way in which it uses that knowledge.  To qualify as an "expert system", the task should have high utility, usually to practitioners of some scientific craft, such as X-Ray crystallographers, or Mass spectroscopists, or Physicians, or Computer Scientists.  Besides utility the system must have power -- in short, it must work.  High performance is a hallmark of expert systems.~j\f5 75i4I84i4I78i7I159i7I22i5I22bi4B1I
~j\f5
But high performance demands a great amount of domain-dependent expertise, and that takes us to the next two points: representing and using expert knowledge.  We want the knowledge to be both transparent (understandable) and powerful; unfortunately these two concerns are usually at odds with each other, and the result is that we have a collection of representations for knowledge, each one being a compromise between understandability and efficiency.~j\f5 140b16B
~j\f5
In the real world, things are a bit worse than that, and we have to consider a fourth category: this is usually euphemistically called "New Ideas", but what it really is is New Complications:  the problems that arose and had to be overcome in the course of getting the thing to work.  Here I've listed two ubiquitous such concerns: how did the knowledge get extracted from the expert's skull and inserted into an 18-bit address space machine? and second: How can we measure the system's performance?  Performance is a slippery quantity: it has many dimensions: How close does the system come to duplicating the answers given by domain experts? How closely does its line of reasoning parallel theirs? What is the range or versatility of the system? and so on. These are by no means the only two kinds of Complications -- we'll see plenty more in the next few minutes.~j\f5 665i17I
~j\f5
There have been many programs, even as far back as the early 60's, that would meet our criteria for expert systems.  For each one of them, I could go over the task they performed, the kind of knowledge that was stored, how it was stored and used, etc.~j\f5 100i15I
~j\f5
But I won't.~j\f5
~j\f5
Suffice it to say that we learned a lot from those pioneering efforts.  Much of it seems like common sense now, almost trivial in hindsight, but it's worth going over, and it's worth remembering that this wasn't obvious twenty years ago:
~j\f5
First, there's the notion that you need to capture the knowledge that the human expert is employing when he or she carries out the task.  That means representing the facts, the operations, the terms and concepts they use.  But it means more:  a great amount of their expertise is informal, judgmental rules, heuristics, and an expert system is going to sink or swim depending on whether you do or don't capture the expert's heuristic knowledge as well as his factual, textbook-type knowledge.  Heuristics are the uncertain, experiential knowledge the expert rarely articulates; it's the knowledge of "good guessing" and "good practice" that gets picked up during internships, and graduate studenthoods, and apprenticeships.~j\f5 280i26I2i10I
~j\f5
Second, we realized that you can -- and should -- separate out the detailed domain knowledge from the general problem-solving techniques.  The latter stay pretty much constant from task to task, domain to domain, and you can build a relatively simple control structure that incorporates them.  It's the former, the detailed domain-dependent expertise, that you want to focus on, for that's where the real power lies.~j\f5
~j\f5
Finally, you have to be sensitive to the whole environment in which the program is going to be used ultimately.  Maybe that extra factor of 2 isn't as important as a little bit more human engineering:  make the program understandable, keep the experts interested, make the ultimate users comfortable.  Have the program maintain an explicit model of its line of reasoning on a problem, and be able to report it to a user in something approximating simple English sentences. ~j\f5
~j\f5
Given all that, let's move up to the last five years or so.  Here are programs which were built with full knowledge of those early lessons.  We'll take a closer look at some of them, like CASNET, MYCIN, LDS, and AM.  But the list isn't complete: <<>> in fact, there are two entirely new categories of expert systems: ones that work with the field of computer hardware and software itself, and a whole new breed of programs halfway between very high level languages and expert systems: they are programs which automatically acquire new knowledge, and among other things can help an expert who is trying to build an expert system.  We'll look in more detail at Browser, R1, and Teiresias, in a minute.~j\f5
~j\f5
The first system we'll look at is MYCIN, a medical consultant written by Ted Shortliffe of the Stanford HPP around 1975.  MYCIN is one of the world's experts at diagnosing and prescribing for meningitis and some kinds of bacteremia.  It consists of a knowledge base of rules, IF/THEN statements, and a very simple interpreter which exhaustively works backwards through them, in a depth-first search.~j\f5
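~j\f5
A minimal Python sketch of that goal-directed, depth-first walk back through IF/THEN rules may make the idea concrete.  Everything in it is an assumption made for illustration: the rule contents, the names RULES and prove, and the omission of the questions put to the physician and of any weighing of uncertain evidence.  Where the interpreter described above is exhaustive, this sketch simply stops at the first rule that succeeds, and it assumes the rules contain no cycles.~j\f5

```python
# Hypothetical toy rules for illustration; NOT taken from MYCIN's knowledge base.
# Each rule is (conclusion, [premises]): IF all premises hold THEN conclusion.
RULES = [
    ("treat-with-antibiotics", ["bacterial-meningitis"]),
    ("bacterial-meningitis", ["csf-culture-positive"]),
    ("bacterial-meningitis", ["csf-wbc-high", "csf-glucose-low"]),
]


def prove(goal, facts, rules, trace=None):
    """Depth-first backward chaining.

    A goal holds if it is a known fact, or if every premise of some rule
    that concludes it can itself be proved.  Rules that succeed are
    appended to `trace`, giving a crude record of the line of reasoning.
    """
    if trace is None:
        trace = []
    if goal in facts:
        return True
    for conclusion, premises in rules:
        if conclusion == goal and all(
            prove(p, facts, rules, trace) for p in premises
        ):
            trace.append((conclusion, premises))
            return True
    return False


facts = {"csf-wbc-high", "csf-glucose-low"}
fired = []
print(prove("treat-with-antibiotics", facts, RULES, fired))
for conclusion, premises in fired:
    print("concluded", conclusion, "from", premises)
```

The list collected in `fired` is the kind of record a system needs if it is later going to explain why it reached a conclusion.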
~j\f5
It's probably worth going through the trace of an actual run of MYCIN.  The physician enters the name of the patient, and replies to various questions that MYCIN asks.  There is no fixed order for these questions. Rather, there is just this collection of IF/THEN rules, and as some questions get answered, they determine a "current state", which triggers some more rules -- makes their IF parts true. Those rules fire, their THEN parts are executed, and that in turn causes some new questions to be asked, and the cycle repeats.~j\f5
~j\f5
Here (28-34) we see detailed questions about the CSF -- cerebrospinal fluid that was obtained via a lumbar puncture.  Finally, MYCIN concludes that the patient probably does have viral meningitis, and enters its second phase, deciding on the most appropriate treatment.  It recommends repeating the LP -- spinal tap, and treating with Ampicillin and Chloramphenicol, and keeping that up for 10 days.~j\f5
~j\f5
This patient was run through MYCIN soon after admission to Stanford Hospital, and was given the treatment specified by the program.  Subsequent lab tests bore out its hypotheses, and the prescribed ten days were up last Monday, and Nicole was released.   With a disease such as meningitis, a delay of even a day can lead to brain damage or death, and that's one of the factors which makes this task of such high utility.  The reason I'm dwelling on this case is that Nicole L is Nicole Lenat, my daughter, and I'm glad to report she's doing fine now. ~j\f5
~j\f5
The validation of a program depends on more than one anecdote, and a large survey was done on MYCIN recently, to evaluate its performance.  As you see, in the eyes of 8 independent experts, MYCIN did about as well as several of the medical faculty, and significantly better than the residents and students.~j\f5
~j\f5
But recall that Performance means more than just getting the right answer.  MYCIN is explicitly building up a line of reasoning for how it reaches its answer.  After the consultation on Nicole, Ted Shortliffe asked the system questions to find out why it came to the conclusions it did, why not others, why it recommended what it did, and so forth.  At one point this meant typing out a few critical rules.  Clause (d) of this rule indicates why MYCIN suspected that Nicole might have Hemophilus influenzae.~j\f5 16i11I83i17I
~j\f5
This interaction is not really as graceful as one might hope for, and a 1976 thesis by Randy Davis enlarged the scope of what MYCIN -- and expert systems -- could do in the way of helping to debug and acquire new rules.  Teiresias can almost be considered a Meta-MYCIN, a program which contains knowledge about MYCIN rules, such as this meta-rule.  But Teiresias is more -- it can build up entirely new models dynamically, and that kind of learning capability is going to be necessary as we expand the scope of what an individual expert system works on.~j\f5
~j\f5
Another interesting wrinkle in the MYCIN story is GUIDON: here the idea is to use the MYCIN knowledge base, to drive a CAI program, a tutor.  The Guidon dialogue looks a lot like MYCIN's, except now it's the program asking the questions and the student providing the answers.  This is another way to evaluate or validate MYCIN -- to show that its knowledge base can be used for a very different purpose than the one for which it was designed.~j\f5 208i7I30i7I48i8I
~j\f5
Now we come to a very remarkable story, the PUFF story.  The story starts out ordinarily enough: a patient is experiencing breathing difficulties; he comes to Pacific Medical Center, in SFO, and they have him breathe in and out of a tube connected to an instrument which is linked to a computer.   The resultant data about flow rates and volumes is processed and ready for the diagnostician to interpret.  Instead of having the human do this job, the PUFF program is supposed to take this kind of data and produce polished interpretations like this one.   One hundred cases were analyzed to produce PUFF's 55 rules.  On an additional 150 cases, PUFF's findings agreed with the experts in over 90% of the cases.  In fact, PUFF is now in routine use at the Pacific Medical Center, and the doctors there frequently just tear off the printout reports PUFF generates and sign them.  ~j\f5
~j\f5
What's remarkable is that there is no fourth line -- no serious complications.  This system was built in 50 hours of interaction with the medical expert, followed by a couple months of effort by knowledge engineers, working to enter that data into a program.~j\f5
~j\f5
Those 55 rules are really just the tip of the iceberg, as it were: PUFF uses much of the MYCIN program as its underlying representation and control structure -- in fact, all we had to do was remove the Meningitis rules and add the Obstructive Airways Disease rules.  Here's a typical PUFF rule; you can see that it's set up the same -- in format and intended use -- as the MYCIN rules we saw before.~j\f5
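~j\f5
The idea of keeping the interpreter and swapping the rules can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: run_shell, both rule sets, and every finding name are hypothetical, none of them comes from MYCIN or PUFF, and for brevity the loop chains forward from the findings rather than backward from a goal.~j\f5

```python
def run_shell(rules, findings):
    """Domain-independent interpreter (illustrative only).

    Repeatedly fires any rule whose IF part is satisfied by what is known,
    adds its THEN part, and stops when nothing new can be concluded.
    Returns only the newly drawn conclusions.
    """
    known = set(findings)
    changed = True
    while changed:
        changed = False
        for if_part, then_part in rules:
            if if_part <= known and then_part not in known:
                known.add(then_part)
                changed = True
    return known - set(findings)


# Two hypothetical knowledge bases; each rule is (IF-set, THEN-conclusion).
MENINGITIS_RULES = [
    ({"fever", "stiff-neck"}, "suspect-meningitis"),
]
PULMONARY_RULES = [
    ({"low-fev1-fvc-ratio"}, "airway-obstruction"),
    ({"airway-obstruction", "improves-after-bronchodilator"},
     "reversible-obstruction"),
]

# The same shell serves either domain; only the rule list changes.
print(run_shell(MENINGITIS_RULES, {"fever", "stiff-neck"}))
print(run_shell(PULMONARY_RULES,
                {"low-fev1-fvc-ratio", "improves-after-bronchodilator"}))
```

Nothing in run_shell mentions medicine at all, which is the point: the domain lives entirely in the rule list handed to it.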
~j\f5
If you will forgive a generalization creeping into this part of the talk, I will cite the general trend to take a specific expert system, excise the knowledge base, and -- presto -- what you have left is a tool you can put on your knowledge engineer's workbench, a vessel already tried and true, awaiting only rules from some new domain to be put into it.~j\f5
~j\f5
Several MYCIN spinoffs have been quickly and successfully built.  ~j\f5
~j\f5
Similarly, Waterman's Legal Decisionmaking System -- LDS -- was built on top of an earlier system, ROSIE.  I won't go through even one typical rule for LDS -- there are over 13 clauses I omitted here -- remember it is a legal expert.~j\f5 220i5I
~j\f5
Consider now CASNET, Casimir Kulikowski's medical consultant system.  It has grown into EXPERT, with Weiss' help, a general vehicle much like EMYCIN.  EXPERT distinguishes ground-level findings or observations from higher-level conjectures or hypotheses.  Therefore there are different types of rules.  This first one connects two findings: in English it would say that if your patient is a male, then he's not pregnant.  That conclusion is more than a mere hypothesis.  The second rule is merely conjectural -- palpitations of the heart and tremors in the fingers suggest hyperthyroidism.  And the third type of rule is even more abstract.  Certain types of hypothesized conditions suggest others.  EXPERT, as CASNET, grapples with a causal reasoning component, and often a complication arises from trying to reconcile its results with those from the rules.~j\f5 487i2I76i7I
~j\f5
I don't mean to give the impression that expert systems are all medical, and one of the farthest removed from medicine is my own thesis program, AM.  AM had no precisely defined goal; rather it was armed with concepts and heuristics from finite set theory, and instructed to wander out into that world and do good things.  AM began with 115 basic concepts, represented as Frames -- property or association lists of attribute-value pairs.  242 IF/THEN rules were tied to this hierarchy, and the structure of these concepts induced a structure on the set of AM's heuristic rules.  A typical such rule said...  Intersection->Disjointness,
Divisors-Of -> SetOfNosWith0--3Divisors.   This rule suggests defining the following 5 sets:  Among them are prime numbers, and certainly we can't say AM has discovered Prime numbers until it finds out why they're interesting, why they're worth naming.  There are several heuristics -- several of the IF-THEN rules -- in AM which make just such judgments.  Here are four rules which work together. One of them caused AM to gather empirical data about each set, and another caused it to look for connections between members of one set and known kinds of numbers.  Something interesting was found about this fourth set of numbers, those with precisely 3 divisors.  Namely, they all appear to be perfect squares.  One of the 4 rules said to not pass up a bargain -- in this case, take their square roots.  Doing that reveals them to be precisely the set above them -- numbers with two divisors.  Rule #2 applies again, raising the estimated worth of both numbers with 2 and with 3 divisors.  You get the flavor, I hope. 
~j\f5 522i7I265i10I
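The arithmetic behind that little episode is easy to replay.  The Python below is only a sketch of the empirical check, not AM's representation or its heuristics; the names divisors_of, LIMIT and by_count, and the cutoff of 200, are assumptions chosen for the illustration.~j\f5

```python
from math import isqrt


def divisors_of(n):
    """All positive divisors of n, by brute force."""
    return [d for d in range(1, n + 1) if n % d == 0]


LIMIT = 200  # arbitrary cutoff for gathering empirical data

# Group the numbers 1..LIMIT by how many divisors they have (0 through 3).
by_count = {
    k: [n for n in range(1, LIMIT + 1) if len(divisors_of(n)) == k]
    for k in range(4)
}

three = by_count[3]
roots = [isqrt(n) for n in three]

# Empirical observation 1: every number with exactly 3 divisors is a perfect square.
all_squares = all(r * r == n for r, n in zip(roots, three))

# Empirical observation 2: each square root has exactly 2 divisors.
roots_have_two = all(r in by_count[2] for r in roots)

print(three)
print(roots)
print(all_squares, roots_have_two)
```

Both checks are evidence over a finite range, not proofs, which is the status the conjecture has at this point in the story.
~j\f5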
AM discovered most of the well-known set-theory relationships, such as de Morgan's laws, and then defined a concept isomorphic to natural numbers, as we see in this excerpt.  Arithmetic quickly followed, and AM found many amusing concepts, and conjectures relating them.~j\f5
~j\f5
One interesting feature about AM was its control: a best-first search, managed by an agenda, a job-queue of plausible little topics to investigate.~j\f5
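~j\f5
An agenda of this kind can be sketched with a priority queue.  The Python below is a minimal illustration, not AM's implementation: the Agenda class, the run function, the task names, and the numeric worths are all hypothetical.~j\f5

```python
import heapq


class Agenda:
    """A job queue that always hands back the task with the highest worth."""

    def __init__(self):
        self._heap = []
        self._count = 0  # tie-breaker: equal-worth tasks come out in insertion order

    def add(self, worth, task):
        # heapq is a min-heap, so store the negated worth.
        heapq.heappush(self._heap, (-worth, self._count, task))
        self._count += 1

    def pop(self):
        neg_worth, _, task = heapq.heappop(self._heap)
        return -neg_worth, task

    def __bool__(self):
        return bool(self._heap)


def run(agenda, expand, budget):
    """Best-first control: do the most promising task, queue whatever it suggests."""
    done = []
    while agenda and len(done) < budget:
        _, task = agenda.pop()
        done.append(task)
        for new_worth, new_task in expand(task):
            agenda.add(new_worth, new_task)
    return done


# Hypothetical follow-up suggestions; the worths are made-up numbers.
FOLLOW_UPS = {
    "examine divisors-of": [
        (700, "define numbers with 2 divisors"),
        (300, "define numbers with 0 divisors"),
    ],
    "define numbers with 2 divisors": [
        (800, "look for conjectures about primes"),
    ],
}

agenda = Agenda()
agenda.add(500, "examine divisors-of")
agenda.add(400, "examine set-union")
order = run(agenda, lambda task: FOLLOW_UPS.get(task, []), budget=4)
print(order)
```

Because newly suggested tasks compete with everything already queued, a promising discovery pulls attention toward its own follow-ups, while the less promising topics wait their turn.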
~j\f5
Having paved the way, AM made it easier to build Browser, by Dankel here at U. Illinois, now at Florida.  Its task was to sift through a data base about, say, airplane crashes, and do the same AM-like open-ended concept play, hoping to find some valuable new concepts and conjectures, such as LikelyToCrash.  As with many of the spinoff systems, there is no fourth line here, no new towering unforeseen complication.  Browser used the same kind of representation and control structure as AM.  Unlike most spinoffs, though, the work had no tangible tie to AM -- Dankel worked from articles about AM rather than physically taking the AM code and excising the math knowledge.~j\f5
~j\f5
Another expert system whose domain is computers themselves is R1, by John McDermott at CMU.  The idea here is to aid a potential DEC customer in configuring a VAX-11/780 system, often entailing his purchasing additional equipment he hadn't known he'd need.  It's a lot better to find that out at ordering time than six months later.  R1 is built on top of OPS4, and no dramatic complication arose.  In fact, a prototype was running after just a few months of John's consulting time had been expended.  The actual R1 syntax is quite arcane, so here's what one rule would look like in English.~j\f5
~j\f5
I don't have time now to go into more detail about any of these systems, so let me steal a bit of Randy's thunder by drawing a few conclusions from the data.~j\f5
~j\f5
The expert systems we've seen so far have from 50-500 rules -- not 1 or 2, and not 10,000.  They draw a lot of their power from the easy extensibility of their knowledge bases.  If MYCIN were coded like this, it would be nigh onto impossible to add a new bit of knowledge.  As it stands, the physician simply tells MYCIN a new rule -- he needn't know where to put it, he needn't know what other rules exist in the system.  If a new technique or drug becomes available, it can be entered without regard to the rest of the program.  That's important because no matter how simple you think a problem is, (MONKEY) there's always the possibility that your knowledge is incomplete.  If you build a program with interdependencies, and it grows to a huge size, it becomes increasingly difficult to modify successfully.~j\f5 309i5I
~j\f5
We could sum those up by saying use a uniform representation for knowledge, divide it up into chunks that seem about the right size of granularity to the domain expert, and don't worry overmuch about the inference engine -- the control structure.  A generate&test, like most of the systems used, or best-first search like AM, or match&accrete like R1, will work fine.  Building on the work of others -- using tools from the KE Workbench -- representation languages, is one of the biggest wins we've seen for avoiding unforeseen complications.~j\f5
~j\f5
One of the features of recent expert systems is their ability to deal with uncertainty and contradiction.  Uncertainty and error can creep in anywhere, and we don't want a system to simply work for 3 hours and then print out FALSE.  Just as with hardware, the solution lies in redundancy.  In AM's case, there was actually more than one possible goal -- when faced with a problem, AM aborted the task it was working on, and blithely went on to another.  Aside from math research, we can't usually choose to do that, but we can make our knowledge bases redundant -- have several reasoning paths that would lead to the same answer.~j\f5 277i10I59i4I
~j\f5
The State of the Art in Expert Systems is the subject of the final part of this session, and though there's no time to cover it now, I wanted to at least expose you to my view of where we're at in 1981.  I guess I'm pessimistic at heart; if you want to look on the bright side, replace all those "Limited"s by "Areas of Current Research". Randall, we're ready for that snappy conclusion now! ~j\f5
~j\f5
~j\f5
~j\f5
~j\f5
~j\f5